
Sub-quadratic Algorithms for Kernel Matrices via Kernel Density Estimation

Ainesh Bakshi, Piotr Indyk, Praneeth Kacham, Sandeep Silwal, Samson Zhou

Category: Machine Learning

2022-12-01

Kernel matrices, as well as weighted graphs represented by them, are ubiquitous objects in machine learning, statistics and other related fields. The main drawback of using kernel methods (learning and inference using kernel matrices) is efficiency -- given $n$ input points, most kernel-based algorithms need to materialize the full $n \times n$ kernel matrix before performing any subsequent computation, thus incurring $\Omega(n^2)$ runtime. Breaking this quadratic barrier for various problems has therefore been a subject of extensive research efforts. We break the quadratic barrier and obtain $\textit{subquadratic}$ time algorithms for several fundamental linear-algebraic and graph processing primitives, including approximating the top eigenvalue and eigenvector, spectral sparsification, solving linear systems, local clustering, low-rank approximation, arboricity estimation and counting weighted triangles. We build on the recent Kernel Density Estimation framework, which (after preprocessing in time subquadratic in $n$) can return estimates of row/column sums of the kernel matrix. In particular, we develop efficient reductions from $\textit{weighted vertex}$ and $\textit{weighted edge sampling}$ on kernel graphs, $\textit{simulating random walks}$ on kernel graphs, and $\textit{importance sampling}$ on matrices to Kernel Density Estimation and show that we can generate samples from these distributions in $\textit{sublinear}$ (in the support of the distribution) time. Our reductions are the central ingredient in each of our applications and we believe they may be of independent interest. We empirically demonstrate the efficacy of our algorithms on low-rank approximation (LRA) and spectral sparsification, where we observe a $\textbf{9x}$ decrease in the number of kernel evaluations over baselines for LRA and a $\textbf{41x}$ reduction in the graph size for spectral sparsification.
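
To make the idea of reducing sampling to density queries concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the authors' construction: the `density` function is a hypothetical stand-in that sums kernel values exactly (a real Kernel Density Estimation structure would return an approximation in sublinear time), the Gaussian kernel and bandwidth are arbitrary choices, and the halving scheme in `sample_neighbor` is just one plausible way to turn density queries into samples. Note that the sampled neighbor may be the query vertex itself.

```python
import math
import random


def gaussian_kernel(x, y, bandwidth=1.0):
    """Gaussian kernel value k(x, y); the bandwidth is an illustrative choice."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * bandwidth ** 2))


def density(points, query, lo, hi):
    """Hypothetical stand-in for a KDE query: sum of k(query, x_j) for j in [lo, hi).

    Computed exactly here in O(hi - lo) time; a real KDE data structure
    would return an estimate of this sum much faster.
    """
    return sum(gaussian_kernel(query, points[j]) for j in range(lo, hi))


def sample_vertex(points, rng):
    """Sample vertex i with probability proportional to its row sum (weighted degree)."""
    n = len(points)
    degrees = [density(points, p, 0, n) for p in points]
    return rng.choices(range(n), weights=degrees, k=1)[0]


def sample_neighbor(points, i, rng):
    """Sample j with probability proportional to k(x_i, x_j).

    Repeatedly halves the index range, choosing each half with probability
    proportional to its density mass, so only O(log n) density queries are made.
    """
    lo, hi = 0, len(points)
    while hi - lo > 1:
        mid = (lo + hi) // 2
        left = density(points, points[i], lo, mid)
        right = density(points, points[i], mid, hi)
        if rng.random() * (left + right) < left:
            hi = mid
        else:
            lo = mid
    return lo


if __name__ == "__main__":
    rng = random.Random(0)
    pts = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0), (5.1, 5.0)]
    i = sample_vertex(pts, rng)
    j = sample_neighbor(pts, i, rng)
    print("sampled edge:", (i, j))
```

Sampling a vertex by weighted degree and then a neighbor by kernel weight yields an edge with probability proportional to its weight, and repeating the neighbor step simulates a random walk on the kernel graph.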


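We study iterative methods based on Krylov subspaces for low-rank approximation under any Schatten-$p$ norm. Here, the matrix $A$ is accessed through matrix-vector products, and the goal is to find a matrix $Z$ with $k$ orthonormal columns such that $\|A(I - ZZ^\top)\|_{S_p} \leq (1+\epsilon)\min_{U^\top U = I_k} \|A(I - UU^\top)\|_{S_p}$, where $\|M\|_{S_p}$ denotes the $\ell_p$ norm of the singular values of $M$. For the special cases of $p = 2$ (Frobenius norm) and $p = \infty$ (spectral norm), Musco and Musco (NeurIPS 2015) obtained an algorithm based on Krylov methods that uses $\tilde{O}(k/\sqrt{\epsilon})$ matrix-vector products, improving on the naive $\tilde{O}(k/\epsilon)$ dependence obtainable by the power method, where $\tilde{O}$ suppresses poly$(\log(dk/\epsilon))$ factors. Our main result is an algorithm that uses only $\tilde{O}(kp^{1/6}/\epsilon^{1/3})$ matrix-vector products and works for all $p \geq 1$. For $p = 2$, our bound improves the previous $\tilde{O}(k/\epsilon^{1/2})$ bound to $\tilde{O}(k/\epsilon^{1/3})$. Since the Schatten-$p$ and Schatten-$\infty$ norms are the same up to a $(1+\epsilon)$ factor when $p \geq (\log d)/\epsilon$, our bound recovers the result of Musco and Musco for $p = \infty$. Further, we prove a matrix-vector query lower bound of $\Omega(1/\epsilon^{1/3})$ for any fixed constant $p \geq 1$, showing that, surprisingly, $\tilde{\Theta}(1/\epsilon^{1/3})$ is the optimal complexity for constant $k$. To obtain our results, we introduce several new techniques, including optimizing over multiple Krylov subspaces simultaneously and pinching inequalities for partitioned operators. Our lower bound for $p \in [1,2]$ uses the Araki-Lieb-Thirring trace inequality, whereas for $p > 2$ we appeal to a norm-compression inequality for aligned partitioned operators.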
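
The relationship between Schatten-$p$ norms for large $p$ and the spectral norm can be checked numerically. The sketch below is only an illustration: the function name `schatten_norm` and the test values are assumptions made here, and the singular values are supplied directly rather than computed from a matrix. It relies on the standard bound $\|\sigma\|_\infty \leq \|\sigma\|_p \leq d^{1/p}\|\sigma\|_\infty$ for $d$ singular values, where $d^{1/p} = e^{\epsilon}$ at $p = (\log d)/\epsilon$, which is close to $1+\epsilon$ for small $\epsilon$.

```python
import math


def schatten_norm(singular_values, p):
    """l_p norm of the singular values; p = math.inf gives the spectral norm."""
    top = max(singular_values)
    if math.isinf(p) or top == 0:
        return float(top)
    # Factor out the largest singular value so large p does not overflow.
    return top * sum((s / top) ** p for s in singular_values) ** (1.0 / p)


if __name__ == "__main__":
    d, eps = 1000, 0.1
    sigma = [1.0] * d  # worst case for the gap: all singular values equal
    p = math.log(d) / eps
    ratio = schatten_norm(sigma, p) / schatten_norm(sigma, math.inf)
    print("p =", p, "ratio =", ratio, "bound e^eps =", math.exp(eps))
```

With all singular values equal, the ratio attains the upper bound $d^{1/p}$, so this case shows how tight the $p \geq (\log d)/\epsilon$ threshold is.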


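This work focuses on the analysis of hand gestures involved in the process of hand washing. The World Health Organization hand hygiene guidelines define six standard hand hygiene gestures for washing hands. In this paper, hand features such as the contours of the hands, the centroid of the hands, and the extreme hand points along the largest contour are extracted using the computer vision library OpenCV. These hand features are extracted for each frame of a hand hygiene video. A robust hand hygiene dataset of video recordings was created in the project, and a subset of this dataset is used in this work. The extracted hand features are then grouped into classes using the KNN algorithm with cross-fold validation, for the classification and prediction of unlabelled data. A mean accuracy score of >95% is achieved, showing that the KNN algorithm with an appropriate input value of K=5 is effective for classification. A complete dataset with six distinct hand hygiene classes will be used with the KNN classifier in future work.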
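
The classification step can be sketched in plain Python. This is a minimal illustration, not the paper's pipeline: the two-dimensional feature vectors are synthetic stand-ins for the extracted hand features (for example a centroid position), the class labels `gesture_a` and `gesture_b` are hypothetical, and the fold assignment is a simple interleaved split chosen for brevity. Only K=5 and the use of cross-fold validation come from the text above.

```python
import math
import random
from collections import Counter


def knn_predict(train, query, k=5):
    """Majority vote among the k training points nearest to query.

    train is a list of (feature_vector, label) pairs.
    """
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


def cross_val_accuracy(data, k=5, folds=5):
    """Mean accuracy over interleaved folds: fold f holds items f, f+folds, ..."""
    scores = []
    for f in range(folds):
        test = data[f::folds]
        train = [item for idx, item in enumerate(data) if idx % folds != f]
        correct = sum(knn_predict(train, x, k) == y for x, y in test)
        scores.append(correct / len(test))
    return sum(scores) / len(scores)


def make_toy_data(rng, per_class=40):
    """Synthetic, well-separated 2-D features for two hypothetical gesture classes."""
    centers = {"gesture_a": (0.0, 0.0), "gesture_b": (10.0, 10.0)}
    data = [
        ((rng.gauss(cx, 1.0), rng.gauss(cy, 1.0)), label)
        for label, (cx, cy) in centers.items()
        for _ in range(per_class)
    ]
    rng.shuffle(data)
    return data


if __name__ == "__main__":
    data = make_toy_data(random.Random(0))
    print("mean cross-validated accuracy:", cross_val_accuracy(data, k=5, folds=5))
```

On real data the feature vectors would come from the per-frame contour, centroid, and extreme-point measurements, and the accuracy would depend on how well those features separate the gestures.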


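This paper presents a comparison of various deep learning models, namely Xception, ResNet-50, and Inception V3, for the classification and prediction of hand hygiene gestures recorded in accordance with the World Health Organization (WHO) guidelines. The dataset consists of six hand hygiene movements in video format, gathered from 30 participants. The network consists of pre-trained models with ImageNet weights and a modified model head. After training the models for 25 epochs, the classification report shows an accuracy of 37% (Xception), 33% (Inception V3), and 72% (ResNet-50). The ResNet-50 model clearly outperforms the others in correct class predictions. The major speed limitation can be overcome in future work by using a fast GPU. A complete hand hygiene dataset, along with other generic gestures such as one-hand movements (linear hand motion; circular hand rotation), will be tested with the ResNet-50 architecture and its variants for health care workers.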
